Let’s create a new RStudio project to work in:
In our new project, let’s create a data/ folder in which to store the data.
dir.create("data")
Let’s also open a new R script in which to work:
Save it in the project root, e.g. as metadata_dev.R
Let’s install and load all the packages we’ll need for the workshop:
install.packages("tidyverse")
install.packages("here")
install.packages("devtools")
devtools::install_github("ropenscilabs/dataspice")
library(tidyverse)
library(dataspice)
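When the script is re-run, reinstalling every package each time is slow. The following is a minimal sketch of one way to install only what is missing. `missing_pkgs()` is a hypothetical helper written for this illustration, not part of any package, and the `interactive()` guard is an assumption so that nothing is installed when the script runs unattended.

```r
# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

cran_pkgs <- c("tidyverse", "here", "devtools")
to_install <- missing_pkgs(cran_pkgs)

# Install only in an interactive session, and only what is missing.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

dataspice itself is installed from GitHub in the steps above, so it is left out of the CRAN check.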
For more information on the data source, check the tutorial README.
The readr::read_csv() function allows us to read raw CSV data directly from a URL.
URL for vst_mappingandtagging.csv at bit.ly/mapping_csv
URL for vst_perplotperyear.csv at bit.ly/perplot_csv
vst_mappingandtagging <- read_csv("https://raw.githubusercontent.com/annakrystalli/dataspice-tutorial/master/data/vst_mappingandtagging.csv")
vst_perplotperyear <- read_csv("https://raw.githubusercontent.com/annakrystalli/dataspice-tutorial/master/data/vst_perplotperyear.csv")
You can inspect any object in your environment in RStudio using the View() function:
vst_mappingandtagging %>% View()
vst_perplotperyear %>% View()
write_csv(vst_mappingandtagging, here::here("data", "vst_mappingandtagging.csv"))
write_csv(vst_perplotperyear, here::here("data", "vst_perplotperyear.csv"))
create_spice()
By default, this creates a metadata folder inside your project’s data folder (you can specify a different directory) containing 4 CSV files in which to record your metadata.
Let’s start with a quick and easy one, the creators. We can open and edit the file in an interactive Shiny app using edit_creators():
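To confirm the templates were created, a quick check along these lines can help. This is a sketch that assumes the default data/metadata location and that the working directory is the project root.

```r
# The four template files dataspice records metadata in.
expected <- c("access.csv", "attributes.csv", "biblio.csv", "creators.csv")

# Assumes the default location, relative to the project root.
metadata_dir <- file.path("data", "metadata")
found <- list.files(metadata_dir)

# Lists any template that is missing; character(0) when all four exist.
setdiff(expected, found)
```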
edit_creators()
Remember to click on Save when you’re done editing.
Before manually completing any details, we can use dataspice’s dedicated function prep_access() to extract the information required for access.csv:
prep_access()
Again, we can use the edit_access() function to complete the final details required, namely the URL from which each dataset can be downloaded. Use the URL from which we downloaded each data file in the first place (hint ☝️).
We can also edit details such as the name field to something more informative if required.
Remember to click on Save when you’re done editing.
edit_access()
Before we start filling this table in, we can use some base R to extract some of the information we require. In particular, we can use the range() function to extract the temporal and spatial extents of our data.
range(vst_perplotperyear$date, vst_mappingandtagging$date)
## [1] "05/22/15" "11/18/15"
range(vst_perplotperyear$decimalLatitude)
## [1] 42.39229 44.06795
range(vst_perplotperyear$decimalLongitude)
## [1] -72.26573 -71.28145
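The quoted output above suggests the date columns are stored as character strings, in which case range() compares them alphabetically rather than chronologically. That happens to work when all dates fall in the same year, but parsing them first is safer. The sketch below reuses the values printed above and maps them onto the extent fields of biblio.csv. The `extent` list is only an illustration, not something dataspice requires.

```r
# Values copied from the range() output shown above.
date_range <- c("05/22/15", "11/18/15")
lat_range  <- c(42.39229, 44.06795)
lon_range  <- c(-72.26573, -71.28145)

# Parse month/day/two-digit-year strings into Date objects.
dates <- as.Date(date_range, format = "%m/%d/%y")

# Collect the values under the field names used in biblio.csv.
extent <- list(
  startDate       = format(min(dates)),
  endDate         = format(max(dates)),
  southBoundCoord = min(lat_range),
  northBoundCoord = max(lat_range),
  westBoundCoord  = min(lon_range),
  eastBoundCoord  = max(lon_range)
)
str(extent)
```

On the full data, the same as.Date() call can be applied to the date columns before calling range().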
edit_biblio()
# pattern is a regular expression: escape the dot and anchor to the end of the name
data_files <- list.files(here::here("data"),
pattern = "\\.csv$",
full.names = TRUE)
data_files
## [1] "/Users/Anna/Documents/workflows/workshops/dataspice-tutorial/data/vst_mappingandtagging.csv"
## [2] "/Users/Anna/Documents/workflows/workshops/dataspice-tutorial/data/vst_perplotperyear.csv"
data_files %>% purrr::map(~prep_attributes(.x))
edit_attributes()
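To get a feel for what the attributes table needs to hold, the sketch below lists each file's column names in the fileName / variableName layout that appears in the write_spice() output. It is a base R illustration only, not dataspice's implementation. `list_variables()` is a hypothetical helper, and the demo runs on a temporary toy CSV.

```r
# Hypothetical helper: one row per column of each CSV file.
list_variables <- function(files) {
  rows <- lapply(files, function(f) {
    # Read the header plus one data row to get the column names.
    header <- names(read.csv(f, nrows = 1, check.names = FALSE))
    data.frame(
      fileName = basename(f),
      variableName = header,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Demo on a temporary toy file; column names taken from the tutorial data.
toy <- file.path(tempdir(), "toy.csv")
write.csv(
  data.frame(date = "05/22/15", decimalLatitude = 42.39229),
  toy,
  row.names = FALSE
)
vars <- list_variables(toy)
print(vars)
```

In the project itself, the same helper could be called as list_variables(data_files). The description and unitText columns still have to be filled in by hand.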
write_spice()
## Parsed with column specification:
## cols(
## title = col_character(),
## description = col_character(),
## datePublished = col_date(format = ""),
## citation = col_character(),
## keywords = col_character(),
## license = col_character(),
## funder = col_character(),
## geographicDescription = col_character(),
## northBoundCoord = col_double(),
## eastBoundCoord = col_double(),
## southBoundCoord = col_double(),
## westBoundCoord = col_double(),
## wktString = col_character(),
## startDate = col_date(format = ""),
## endDate = col_date(format = "")
## )
## Parsed with column specification:
## cols(
## fileName = col_character(),
## variableName = col_character(),
## description = col_character(),
## unitText = col_character()
## )
## Parsed with column specification:
## cols(
## fileName = col_character(),
## name = col_character(),
## contentUrl = col_character(),
## fileFormat = col_character()
## )
## Parsed with column specification:
## cols(
## id = col_character(),
## givenName = col_character(),
## familyName = col_character(),
## affilitation = col_character(),
## email = col_character()
## )
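Before building the site, it can be useful to look at what was written. The sketch below assumes write_spice() saved its output as dataspice.json in the default data/metadata folder, that the working directory is the project root, and that the jsonlite package is available. It does nothing if either the file or the package is missing.

```r
# Assumed default output location of write_spice(), relative to the project root.
json_path <- file.path("data", "metadata", "dataspice.json")

if (file.exists(json_path) && requireNamespace("jsonlite", quietly = TRUE)) {
  spice <- jsonlite::read_json(json_path)
  # Show the top-level fields of the metadata document.
  print(names(spice))
}
```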
build_site()